This post covers SpaceFusion (NAACL 2019), a regularized multi-task learning framework that jointly optimizes diversity and relevance through a structured latent space.
Introduction
This paper studies dialogue response generation. Vanilla seq2seq models tend to produce bland, generic replies; prior work on improving the diversity and relevance of generated responses falls roughly into two categories:
- Decoding/ranking: optimize only at decoding time, re-ranking beam-search candidates with context-related signals. The drawback is that a very large beam size is required. _A diversity-promoting objective function for neural conversation models_
- Training/latent space: use a CVAE to model discourse-level diversity. The drawback is a loss of response relevance (in the absence of extra dialogue-act labels). _Learning discourse-level diversity for neural dialog models using conditional variational autoencoders_
The idea of this paper is to jointly optimize diversity and relevance at training time by aligning the latent spaces of two models:
- Sequence-to-Sequence (S2S): yields the latent vector of the context
- Autoencoder (AE): yields the latent vectors of the multiple possible diverse responses
A straightforward way to combine them is plain multi-task learning with a shared decoder, as in the sketch below.
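A minimal PyTorch sketch of that multi-task setup, with GRU encoders and a shared GRU decoder (the class and method names are illustrative, not the paper's code):

```python
import torch
import torch.nn as nn

class MultiTaskDialogModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # S2S path: encode the context x into a latent vector z_S2S(x).
        self.ctx_encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # AE path: encode a response y into a latent vector z_AE(y).
        self.resp_encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Shared decoder: reconstructs y from either latent vector.
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def encode(self, tokens, encoder):
        _, h = encoder(self.embed(tokens))  # h: (1, batch, hid_dim)
        return h.squeeze(0)                 # latent vector, (batch, hid_dim)

    def decode_logits(self, z, y_in):
        # Teacher-forced decoding with z as the initial decoder state.
        out, _ = self.decoder(self.embed(y_in), z.unsqueeze(0))
        return self.out(out)                # (batch, len, vocab)
```

Both paths feed the same decoder, so the two latent vectors nominally live in one space.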
However, the drawback of this approach is that the two latent spaces are hard to align.
The paper therefore proposes a geometric approach, SPACEFUSION, which induces a structured latent space where the distance and direction of a predicted response vector correspond to its relevance and diversity, respectively (illustrated with a figure in the paper).
The SPACEFUSION Model
Given a dataset $\mathcal{D}=\left[\left(x_{0}, y_{0}\right),\left(x_{1}, y_{1}\right), \cdots,\left(x_{n}, y_{n}\right)\right]$, where $x_{i}$ and $y_{i}$ denote the context and the response respectively, the goal is to generate relevant and diverse responses.
The core of SPACEFUSION is two regularization terms (both are written out after this list):
- $\mathcal{L}_{\text{fuse}}$: pull each paired S2S and AE point closer to each other; in the experiments the distance $d$ is Euclidean.
- $\mathcal{L}_{\text{interp}}$: encourage a smooth transition between the S2S and AE latent vectors.
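A sketch of the two terms, consistent with the descriptions above (the exact batch averaging and the noise terms used in the paper are omitted):
$$
\mathcal{L}_{\text{fuse}}=\frac{1}{n} \sum_{i} d\left(z_{\mathrm{S2S}}\left(x_{i}\right), z_{\mathrm{AE}}\left(y_{i}\right)\right)
$$
$$
\mathcal{L}_{\text{interp}}=-\frac{1}{|y|} \log p\left(y \mid z_{\text{interp}}(x, y)\right), \quad z_{\text{interp}}(x, y)=u\, z_{\mathrm{S2S}}(x)+(1-u)\, z_{\mathrm{AE}}(y),\ u \sim \mathcal{U}(0,1)
$$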
Finally, these regularizers are combined with the vanilla multi-task loss (the S2S and AE reconstruction terms):
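A plausible form of the combined objective, treating the weights $\alpha$ and $\beta$ as assumptions (the paper may weight the terms differently):
$$
\mathcal{L}=\mathcal{L}_{\mathrm{S2S}}+\mathcal{L}_{\mathrm{AE}}+\alpha \mathcal{L}_{\text{interp}}+\beta \mathcal{L}_{\text{fuse}}
$$
A minimal PyTorch sketch of this objective, assuming teacher-forced logits from the shared decoder for the S2S, AE, and interpolated latents (all names are illustrative):

```python
import torch
import torch.nn.functional as F

def spacefusion_loss(logits_s2s, logits_ae, logits_interp, y,
                     z_s2s, z_ae, alpha=1.0, beta=1.0, pad_id=0):
    # Token-level cross-entropy for (batch, len, vocab) logits against y.
    def ce(logits):
        return F.cross_entropy(logits.flatten(0, 1), y.flatten(),
                               ignore_index=pad_id)

    l_s2s, l_ae = ce(logits_s2s), ce(logits_ae)  # vanilla multi-task terms
    l_interp = ce(logits_interp)                 # smoothness along the line
    # L_fuse: mean Euclidean distance between paired latent vectors.
    l_fuse = (z_s2s - z_ae).norm(dim=-1).mean()
    return l_s2s + l_ae + alpha * l_interp + beta * l_fuse

# logits_interp is obtained by decoding from the interpolated latent:
# u = torch.rand(z_s2s.size(0), 1)
# z_interp = u * z_s2s + (1 - u) * z_ae
```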
Inference: at prediction time, sample $r$ at random with radius $|r|$ (a hyperparameter), use $z(x, r)$ as the initial state of the decoder GRU, and apply greedy decoding:
$$
z(x, r)=z_{\mathrm{S2S}}(x)+r
$$
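A minimal sketch of this inference step, assuming $r$ is drawn uniformly on the sphere of radius $|r|$ (the distribution of $r$ and the `decoder_step` interface are assumptions):

```python
import torch

def sample_latents(z_s2s, radius, num_samples):
    # Perturbations r with fixed norm |r|: normalized Gaussian directions
    # are uniform on the sphere.
    r = torch.randn(num_samples, z_s2s.size(-1))
    r = radius * r / r.norm(dim=-1, keepdim=True)
    return z_s2s.unsqueeze(0) + r  # one latent z(x, r) per sample

def greedy_decode(decoder_step, z, bos_id, eos_id, max_len=30):
    # decoder_step(token_id, state) -> (logits, state): one step of the
    # decoder GRU initialized with z (hypothetical interface).
    tokens, state = [bos_id], z
    for _ in range(max_len):
        logits, state = decoder_step(tokens[-1], state)
        tokens.append(int(logits.argmax()))
        if tokens[-1] == eos_id:
            break
    return tokens
```

Each sampled direction yields a different candidate response for the same context.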
Structured latent space
The regularization terms induce the desired structure in the latent space, mapping semantics onto geometry:
- Diversity -> direction: $\mathcal{L}_{\text{interp}}$ regularizes the semantics along the line between the two latent vectors, so differences in meaning show up as differences in direction
- Relevance -> distance: $\mathcal{L}_{\text{fuse}}$ regularizes the distance between paired context and response vectors, so relevance shows up as closeness (see the toy sketch below)
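A toy decomposition that makes this semantics-to-geometry mapping concrete: the offset from the context vector to a response vector splits into a distance (relevance) and a unit direction (diversity); names are illustrative:

```python
import torch

def latent_geometry(z_s2s, z_ae):
    # Offset from the context latent to a response latent.
    delta = z_ae - z_s2s
    distance = delta.norm(dim=-1)                        # ~ relevance
    direction = delta / (distance.unsqueeze(-1) + 1e-8)  # ~ diversity
    return distance, direction
```

Two responses at a similar distance but in different directions would be comparably relevant yet semantically different.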
Direction & diversity
SpaceFusion tends to map different plausible responses to different directions in the latent space.
Interpolation & smoothness
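Decoding from points on the line between $z_{\mathrm{S2S}}(x)$ and $z_{\mathrm{AE}}(y)$ should produce responses whose meaning shifts smoothly from the predicted reply toward the reference reply, which is exactly what $\mathcal{L}_{\text{interp}}$ encourages. A toy interpolation sweep (decode each point with a `greedy_decode`-style helper as in the inference sketch; names are illustrative):

```python
import torch

def interpolation_sweep(z_s2s, z_ae, steps=5):
    # Evenly spaced points on the segment between the two latents.
    us = torch.linspace(0.0, 1.0, steps)
    return [(1 - u) * z_s2s + u * z_ae for u in us]

# Decoding each point with the shared decoder should show a smooth
# semantic transition between the two endpoints.
```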
Experiments
Conclusion
The paper proposes SpaceFusion, a regularized multi-task learning framework that jointly optimizes diversity and relevance through a structured latent space.